Finding Entities in Wikipedia Using Links and Categories

Authors

  • Rianne Kaptein
  • Jaap Kamps
Abstract

In this paper we describe our participation in the INEX Entity Ranking track. We explored the relations between Wikipedia pages, categories and links. Our approach is to exploit both category and link information. Category information is used by calculating distances between document categories and target categories. Link information is used for relevance propagation and in the form of a document link prior. Both sources of information have value, but using category information leads to the biggest improvements.
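The abstract does not say how category distances or the link prior are computed, so the following is only an illustrative sketch of how such signals might be combined. Every name and formula here is an assumption, not the authors' method: `category_distance` uses one minus Jaccard overlap, `link_prior` uses smoothed incoming-link counts, and `rank_entities` mixes a precomputed text score with the category score using a hypothetical `weight`. Relevance propagation over links is left out.

```python
def category_distance(doc_categories, target_categories):
    """Assumed distance: 1 - Jaccard overlap between the two category sets."""
    doc, target = set(doc_categories), set(target_categories)
    if not doc or not target:
        return 1.0  # no category information: treat as maximally distant
    return 1.0 - len(doc & target) / len(doc | target)


def link_prior(in_links, total_in_links):
    """Assumed document prior: share of incoming links, with add-one smoothing."""
    return (in_links + 1) / (total_in_links + 1)


def rank_entities(candidates, target_categories, weight=0.5):
    """Rank candidate pages by a mix of text score and category closeness,
    scaled by the link prior. `weight` is a hypothetical mixing parameter."""
    total_in_links = sum(c["in_links"] for c in candidates)
    scored = []
    for c in candidates:
        closeness = 1.0 - category_distance(c["categories"], target_categories)
        mixed = (1.0 - weight) * c["text_score"] + weight * closeness
        score = mixed * link_prior(c["in_links"], total_in_links)
        scored.append((c["title"], score))
    # Highest score first
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up candidates for illustration only
    pages = [
        {"title": "Amsterdam", "text_score": 0.6,
         "categories": {"Cities in the Netherlands", "Capitals in Europe"},
         "in_links": 90},
        {"title": "Rembrandt", "text_score": 0.7,
         "categories": {"Dutch painters"}, "in_links": 60},
    ]
    for title, score in rank_entities(pages, {"Capitals in Europe"}):
        print(title, round(score, 3))
```

In this toy setup a page with a slightly lower text score can still rank first when its categories are closer to the target categories, which is the kind of effect the abstract attributes to category information.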


Similar resources

Improving Persian Named Entity Recognition Using the Ezafe Construction

Named entity recognition is the process of identifying people's names, place names (cities, countries, seas, etc.), organizations (public and private companies, international institutions, etc.), dates, currencies and percentages in a text. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...


Finding Entities or Information Using Annotations

User-generated content often provides more information than just textual data: tags or annotations are added to label data, users can comment on the data in discussions, and links exist not only between pages, but also between users and annotations. In this paper we explore the use of annotations in the form of categories to find entities or information in Wikipedia. Main differences betwe...


A Multilingual Entity Linker Using PageRank and Semantic Graphs

This paper describes HERD, a multilingual named entity recognizer and linker. HERD is based on the links in Wikipedia to resolve mappings between the entities and their different names, and Wikidata as a language-agnostic reference of entity identifiers. HERD extracts the mentions from text using a string matching engine and links them to entities with a combination of rules, PageRank, and feat...


Exploiting Locality of Wikipedia Links in Entity Ranking

Information retrieval from web and XML document collections is ever more focused on returning entities instead of web pages or XML elements. There are many research fields involving named entities; one such field is known as entity ranking, where one goal is to rank entities in response to a query supported with a short list of entity examples. In this paper, we describe our approach to ranking...


Generating a Large-Scale Entity Linking Dictionary from Wikipedia Link Structure and Article Text

Wikipedia has been increasingly used as a knowledge base for open-domain Named Entity Linking and Disambiguation. In this task, a dictionary with entity surface forms plays an important role in finding a set of candidate entities for the mentions in text. Existing dictionaries mostly rely on the Wikipedia link structure, like anchor texts, redirect links and disambiguation links. In this paper,...




Publication date: 2008